Rui Yan

Weaving Context Across Images: Improving Vision-Language Models through Focus-Centric Visual Chains

Apr 28, 2025
Via arXiv

ShapeSpeak: Body Shape-Aware Textual Alignment for Visible-Infrared Person Re-Identification

Apr 25, 2025
Via arXiv

Hierarchical Relation-augmented Representation Generalization for Few-shot Action Recognition

Apr 14, 2025
Via arXiv

VideoExpert: Augmented LLM for Temporal-Sensitive Video Understanding

Apr 10, 2025
Via arXiv

V-MAGE: A Game Evaluation Framework for Assessing Visual-Centric Capabilities in Multimodal Large Language Models

Apr 08, 2025
Via arXiv

Scaling Video-Language Models to 10K Frames via Hierarchical Differential Distillation

Apr 03, 2025
Via arXiv

From 1,000,000 Users to Every User: Scaling Up Personalized Preference for User-level Alignment

Mar 21, 2025
Via arXiv

MathFusion: Enhancing Mathematic Problem-solving of LLM through Instruction Fusion

Mar 20, 2025
Via arXiv

Stick to Facts: Towards Fidelity-oriented Product Description Generation

Mar 12, 2025
Via arXiv

AA-CLIP: Enhancing Zero-shot Anomaly Detection via Anomaly-Aware CLIP

Mar 09, 2025
Via arXiv